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ABSTRACT 

We present the scientific performance results of PynPoint, our Python-based software 
package that uses principle component analysis to detect and estimate the flux of 
exoplanets in two dimensional imaging data. Recent advances in adaptive optics and 
imaging technology at visible and infrared wavelengths have opened the door to direct 
detections of planetary companions to nearby stars, but image processing techniques 
have yet to be optimized. We show that the performance of our approach gives a 
marked improvement over what is presently possible using existing methods such as 
LOCI. To test our approach, we use real angular differential imaging (ADI) data 
taken with the adaptive optics assisted high resolution near-infrared camera NACO 
at the VLT. These data were taken during the commissioning of the apodising phase 
plate (APP) coronagraph. By inserting simulated planets into these data, we test the 
performance of our method as a function of planet brightness for different positions 
on the image. We find that in all cases PynPoint has a detection threshold that is 
superior to that given by our LOCI analysis when assessed in a common statistical 
framework. We obtain our best improvements for smaller inner working angles (IWA). 
For an IWA of ~0.29'' we find that we achieve a detection sensitivity that is a factor 
of 5 better than LOCI. We also investigate our ability to correctly measure the flux of 
planets. Again, we find improvements over LOCI, with PynPoint giving more stable 
results. Finally, we apply our package to a non-APP dataset of the exoplanet j3 Pictoris 
b and reveal the planet with high signal-to-noise. This confirms that PynPoint can 
potentially be applied with high fidelity to a wide range of high-contrast imaging 
datasets. 

Key words: planets and satellites: detection, methods: data analysis techniques: 
image processing 



1 INTRODUCTION 

The field of exoplanet research has undergone rapid growth 
with a variety of methods having been developed to detect 
and charactere the properties of planets orbiting stars other 



than our Sun. Among these, radial velocity (e.g., Mayor 
et al.||2011| [Gumming et al.||20Q8| [Howard et aLl|2QlQt and 



transit (e.g., Batalha et al.|2012 Borucki et al.|20l"0 ) meth- 
ods have yielded the majority of the planets that have been 
discovered thus far. Another approach is to image exoplan- 
ets directly. The difficulty with this method is that the flux 
of the planet is substantially smaller than that coming from 
the star it orbits. Hence, a simple image of the system will 
be overwhelmingly dominated by the star. This, in combina- 
tion with the small projected separation on sky between the 
planet and the host, makes this imaging approach to exo- 
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planet detection difficult. However, there have been discover- 
ies of a few remarkable exoplanets and exoplanet candidates: 



HR8799 bcde ([Marois et al.|2008[ [2010[), (3 Pic b ([Lagrange 


et al. 


2009 


2010), 2MASS1207 b (Chauvin et al.[ 


2005), 


LkCA15 b ( 


Kraus & Ireland|2012 ), 1RXS1609 b (Lafreniere 


et al.|2008), 2MASS044144 b (Todorov et al.|2010). Despite 



these successes a large number of dedicated planet search 
surveys have yielded null results (e.g., Masciadri et al.|2005 



Biller et al.|2007l[Kasper et al.|2007| [Lafreniere et al.|2007a 
Chauvin et al.|2010[[Heinze et al.|2010[ ). While one of the key 

conclusions from these surveys has been that massive gas gi- 
ant planets are rare in orbits with separations larger than a 
few tens of AU, the innermost regions around the stars are 
still largely unexplored as we lack the required contrast. To 
improve this situation, a number of technical methods have 
been developed that allow us to effectively suppress the flux 
of the star and thereby form the desired image of the planet. 
These innovations have been in the hardware and techniques 
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for acquiring the data, as well as the analysis methods for 
data processing. 

At the data acquisition stage, new developments in- 
clude new coronagraph designs (for recent overview see, e.g.. 



Guyon et al.|[2006t to suppress the flux and halo from the 
star at small inner working angles (IWA) and improved ob- 
serving strategies for differential techniques that allow for 
better modeling of the point spread function (PSF). Exam- 
ples of these methods are simultaneous differential imaging 
2004) and angular differential imaging 
2006|). The common goal of these ap- 



(SDI, Lenzen et al. 
(ADI, ]M"arois et aL 



preaches is to control the stellar PSF so as to reduce 'speck- 
les', which are residual features in the (final) images that 
can mimic and/or overshadow a planet. Upcoming new dedi- 
cated instruments for planet finding combine several of these 
methods and contrast enhancing techniques with extreme 
adaptive optics (AO) systems, yielding unprecedented PSF 
stability and contrast performances (e.g., SPHERE at the 
VLT, GPI at Gemini; [Beuzit et aT]|2006l [Macintosh et al.| 
2007|. 



Along with these improvements, new data analysis 
methods have been studied. This is a very important part of 
the process since inefficient methods will not allow us to take 
full advantage of the new higher quality datasets that are 
becoming available. On the other hand, highly efficient soft- 
ware solutions allow us to reanalyse existing data to extract 
more information. A good example is the widely used LOCI 
package (Lafreniere et al. 2007b), which is used to model 



and subtract the stellar PSF. This method significantly im- 
proves the contrast performance in ADI datasets compared 
to the classical ADI data reduction approach and, in the 
meantime, has also been be adapted to effectively subtract 
the sky background emission in thermal infrared datasets 
( Galicher et al.|2011 ). 

Here, we present the results of a new data analysis and 
PSF subtraction software package - called PynPoint - that 
we have developed. Our method shows significant improve- 
ments to the contrast performance at very small IWAs in 
ADI datasets. Compared to existing methods, we gain up to 
a factor of five in sensitivity at separations of ^0.29'' . Pyn- 
Point can be applied to a wide range of datasets, including 
any existing ADI dataset and does not require any special 
observing strategy or instrument setup. 

In section 2, we give a description of the dataset and 
simulations we used. In section 3, we give an overview of 
the main top-level steps used in PynPoint. In sections 4 
and 5, we show our performance in both planet detection 
and flux measurements. We compare our results with those 
we obtained using LOCI. In section 6, we apply PynPoint 
to a high-contrast dataset of the exoplanet /S Pic b. Our 
conclusions and discussion on future prospects are presented 
in section 7. 



2 DESCRIPTION OF TEST DATA 

To test and validate PynPoint, we created a suite of simula- 
tions based on real ADI data. The dataset we used consisted 
of images taken with the adaptive optics (AO) assisted high 
resolution camera - NACO - at the VLT ( [Lenzen et al.|2003| 



thy et al.|20Tol[Girard et al.|2010 ) coronagraph, which took 
place in April 2010. We observed the young, nearby debris 
disk host star HD 115892 in pupil stabilised ADI mode in the 
NB4.05 filter and used a detector integration time (DIT) for 
individual exposures of 0.5 s. A complete description of the 
observations and a first data reduction and analysis (done 
with LOCI) is presented in Quanz et al. (2011 ). Despite the 



fact that we did not detect any faint companions around 
HD 11 5892, this dataset is well suited for our current pur- 
poses since the APP already provides a good contrast perfor- 
mance close to the star using current reduction techniques. 
The full test dataset consists of 10,800 individual exposures 
contained in 54 data cubes (all images from 'hemisphere 1', 
see Quanz et al. (2011)) which corresponds to a total on- 



source integration time of 1.5 hours. The field rotation over 
the full image stack was ~ 50°. 

The basic data reduction steps (background subtrac- 
tion, bad pixel correction, and image alignment) are de- 



scribed in Quanz et al. (2011) and the only additional step 



Rousset et "aL][2003t . These images were taken during the 



commissioning of the apodising phase plate (APP; Kenwor- 



was to create postage stamp images from each exposure by 
cutting out the innermost 59x59 pixels centered on the star. 
From these postage stamps, we then created a suite of sim- 
ulations by inserting fake planets to the individual raw im- 
ages. Each of our simulations contains one planet at one of 
the three positions shown in Figure [l] For the image of the 
planet, we used a scaled version of an unsaturated image of 
HD 115892 ( this image was created from averaging over a 
stack of 200 individual unsaturated frames; see also jQuanzj 
eraI][2QTT] ). The full- width-half-maximum (FWHM) of the 
unsaturated PSF was ~4.5 pixels. By varying the brightness 
of the planet, we can study the sensitivity of our method as 
a function of the magnitude contrast between the central 
star and the planet at a a given position. 



3 OVERVIEW 

PynPoint is a Python-based software package that we have 
developed for processing high contrast imaging data of ex- 
oplanets. Once the data from an observation run has been 
reduced to a stack of postage stamps, four main steps are 
needed for our planet detection procedure. These are: 

(i) constructing a basis set for the analysis; 

(ii) fitting the stellar PSF to the individual images; 

(iii) correcting for the PSF and central star; and 

(iv) averaging over an image stack. 

We have implemented methods in our package for dealing 
with these four steps. In so doing, we have made improve- 
ments in all the steps, but in particular our main gain comes 
from an improved treatment of the first step. Specifically, we 
have used a Principle Component Analysis (PGA) approach 
to empirically build the basis set from the data. In the fol- 
lowing sections, we discuss each of these steps in more detail. 



3.1 Construct Basis Set 

3.1.1 Overview of Basis Functions 

One of the key challenges in analysing high contrast imaging 
data is to correct for the flux of the central star. The star is 
effectively a point source, so its image will be that of the PSF 
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Figure 1. The top panel shows an example of a PSF as observed 
in an individual frame. On the image we show the three positions 
of the planets that we used to test our performance. We will refer 
to these positions as PI (shown by the green circle symbol), P2 
(red diamond) and P3 (turquoise/blue square). The symbols show 
the starting positions of each of the planets and the arcs show the 
track of each planet during the ADI observing sequence. Note that 
in our simulations only one planet is used at a given time. The 
lower panel shows an example of one of the planets that we added 
to our simulated images. This planet is the brightest example (A 
NB4.05 = 8 mag). For the faintest example, we used (A NB4.05 
= 11 mag), i.e., ~16 times fainter. The unit of the x and y axes 
in the images is pixels, while the flux levels are given in detector 
counts. 



of the respective telescope-instrument combination altered 
by atmospheric effects (for ground-based observations). This 
PSF can be modeled in a number of ways, and one effective 
approach is to decompose it into a set of basis functions 
so that the star image can be described through a linear 
combination such that 



(1) 



where I(x) is the image of the PSF, (/)(x) is a given basis 
and ai is the coefficient for each basis function. Often it is 
convenient to work with orthonormal basis functions, such 



/ 



(l)i{x)(l)j{x) dx — 



(2) 



where 5ij is the Kronecker delta. This is because it is compu- 
tationally convenient to use such orthogonal basis functions, 
since it is easy to calculate the coefficients, a^, for a given 
image /(x), using 



/ 



ai ^ I{x)(j)i{x) dx. 



(3) 



While in this work we have focused on orthogonal complete 
basis sets, it should be noted that it is also possible to use 
over-complete basis sets, where the basis functions are not 
strictly orthogonal to each other. 

There are a number of popular basis set functions that 
have been widely used and studied. As an illustration, two 
such examples of basis set functions are: (i) Fourier - decom- 
position into sine and cosine functions; and (ii) Shaplets - a 
decomposition into Gaussian weighted Hermite polynomials 



(for example see Refregier 2003 ). Another approach, which is 
the one that we have adopted, is to empirically create a basis 
set from the data. Typically, this type of empirical approach 
leads to basis functions that are more efficient at expressing 
the underlying function than generic basis sets. Here, the 
efficiency typically refers to a measure of the number of co- 
efficients that would be needed so that residuals between a 
model and the original images are lower than a given thresh- 
old. Efficient basis functions will require a smaller number of 
coefficients than non-efficient ones. Since many of the basis 
sets that we would consider using are complete (or over- 
complete), meaning that they can be used to describe any 
function when an infinite series is used, we might think that 
we are free to use any of them. However, since in practice 
we have to contend with noise, which effectively truncates 
the series, the choice of basis sets becomes important (for 
an example of the impact of basis set choice, see |Kitching fc| 
|Amara||2009| . In this case, it is better to use efficient basis 
functions. 

In weak gravitational lensing, the impact and impor- 
tance of basis sets in modeling the PSF pattern on an image 
have been widely studied, due to the fact that a critical 
step in the weak lensing measurement process is to mea- 
sure and correct for the PSF at the positions of galaxies 
(for top-level overview of these processes, see [Bridle et al. 
2009; Amara 2011 ). In particular, Jee et al. (2007) have com- 
pared shapelet, wavelet and PC A methods and shown that 
for their application the empirical PCA method gave the 
best results. Though different in detail, since in weak lens- 
ing we work with a spatially varying PSF while in exoplanet 
direct imaging the PSF varies primarily in time for ground- 
based AO-assisted instruments, the key aspects of the two 
problems share many similarities. For this reason, we have 
focused our initial studies, which we present here, on PCA 
basis functions. 



3.1.2 PCA Basis Sets 

A brief discussion of the basic principles of PCA methods 



is given by Jee et al. (2007). In this section, we give a short 



summary of the key calculations and outline some of the 
key issues that need to be considered. We calculated the 
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principle components of our images using singular value de- 
composition (SVD, see Press et al. 2007| ). To do this, we 
constructed a two dimensional array S, where each row i of 
the array is a vectorised version of image i of the stack. This 
means that S has dimensions M x N, where N is the number 
of pixels in an image and M is the number of images in a 
stack. We calculate the SVD such that 

S = UWV'^, (4) 

where W is a diagonal matrix with positive (or zero) ele- 
ments. In this decomposition, V is a matrix containing the 
PCA elements that form an orthogonal basis set (U and V 
are column-orthogonal matricies). 

After running a series of tests, we were able to optimise 
our PCA decomposition approach by using a careful treat- 
ment of the data before applying the decomposition shown 
in equation |4] Three important steps, which can be seen in 
the top panels of Figure [2] are: 

(i) Subpixel Sampling. We have a large number of PSF 
realisations, with each one expected to have small offsets 
between the PSF centre (position of the star) and the pixel 
grid set by the detectors. This then allows us to reconstruct 
features over the stack at a resolution that is greater than 
that of any individual image. For instance, in the example 
shown in Figure [2] we have doubled the resolution of the 
original images to 118 x 118 pixels as compared with the 
input images (shown in Figure[T]) with a resolution of 59 x 59. 

(ii) Central Mask. We know that the central region dom- 
inates the flux of the image, and in our HD 115892 dataset 
the core is even saturated ( Quanz et al.||2011 ) and contains 
little useful information. We find that our analysis methods 
work best when this central region is masked. This is shown 
in the top left panel of Figure [2] 

(iii) Mean Subtraction. When constructing the PCA basis 
functions, we have the option of subtracting the mean image, 
which is generated from the entire stack, from each of the 
individual images before the decomposition. This should be 
a subtle effect, since if the stack has a mean image then this 
would likely be (or dominate) the first principle component. 
Hence, we could expect the step of removing the mean to 
have little effect. However, since the PCA basis functions are 
forced to be orthogonal to each other, including the mean as 
one of the PCAs forces the higher PCA functions to be or- 
thogonal to it. By first subtracting the mean, this condition 
is removed and we find that the resulting PCA functions 
work better. An example of a single mean subtracted image 
is shown in one of the upper panels of Figure [2] 

The bottom panels of Figure [2] show a subsample of the 
PCA functions for the example we have used in this study. 
Specifically, we show the PCA functions 1, 2, 4 and 8. In 
the upper center-right panel we show the reconstruction of 
a single mean-subtracted image using 100 PCA coefficients. 
In the top right panel, we show the residuals between the 
data and the model. 



3.1.3 Further Considerations 

One important consideration in empirically building the ba- 
sis set is to decide which data to use. For instance, if we 
use the same data to construct the PCAs as what we will 
use to detect the planets, then there is a danger that the 



PCA will incorporate the planet and so will tend to remove 
it from the analysis. There are a number of ways that we 
could consider to overcome such a problem. For instance, the 
data could be divided into a training sub-set and an analysis 
sample. Alternatively, we could try to build a generic basis 
set for a particular instrument using, for instance, archive 
data of different stars. Since the training set would be in- 
dependent of the data used to make the measurements, we 
reduce biases coming from the modeling. The disadvantage 
is that these PCA functions are likely be less efficient. 

A further consideration is that the matrix operations 
in equation^ can be computer memory intensive when the 
matrix S becomes large. Ultimately, this will likely lead to 
a trade-off between the number of images in a stack and 
the sub-pixel resolution we wish to recover. These optimisa- 
tions still require further refinement, but our finding in the 
work presented here suggests that using the same stack to 
construct the PCA as is used in the data analysis does not 
present a serious problem. We believe this is because 1) the 
planets are substantially fainter than the star, and 2) the 
planets "rotate" around the star in our ADI observations. 



3.2 Fitting the PSF 

For each image in our stack we are able to calculate the PCA 
coefficients using equation [S] When using the whole image, 
the only variable that needs to be specified by the user is 
the number of PCA coefficients that should be used in the 
fit. For the case that we are studying here, we find that very 
good results are achieved for roughly 100 PCA coefficients 
per fit. Beyond this we find that the improvements in the 
residuals are marginal. This process is the simplest fitting 
procedure that we could implement. Nonetheless, we have 
also incorporated a fitting scheme that allows the user to 
specify a mask. The reason to do this is that in some cases, 
for instance when we want to measure the flux of the planet, 
it is useful to mask out the planet location and only use the 
data in the rest of the image to fit a PSF model. This then 
limits the extent to which the planet itself is included in 
the fitting and avoids overcorrection and reduces possible 
biases in flux measurement. The disadvantage is that the fit 
inside the masked region, by necessity, will not be as good. 
Therefore, random error will increase. Another complicating 
factor is that once a mask is introduced, the resulting basis 
functions (multiplied by the mask) are no longer orthogonal. 
This means that we can no longer find the linear coefficients 
using equation [3] However, since the masks that we typically 
introduce tend to be small, equation [S] still gives a good first 
approximation to the best-fit coefficients. After this, we then 
refine the coefficients, using a minimisation algorithm, to 
minimise the residuals. 



I - imy X M, 



(5) 



where Im is the model PSF images for that frame, M is 
the mask image that blocks out the planet position and the 
sum is performed over all pixels in the image. This methods 
works because many of the features that a PSF has within 
the masked region are correlated to its properties in the 
unmasked regions. Given a good model for a PSF and a good 
fit in the unmasked region, we should be able to perform a 
reasonable reconstruction of the PSF where the mask sits. 
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Figure 2. The top panels show an example of our PSF fitting approach. From left to right, we show: (i) a masked version of a single 
image PSF where the centre and the corners of the image have been removed; (ii) the same PSF with the average over the stack removed; 
(iii) the reconstruction with 100 PCA coefficients; and (iv) the residuals between the mean subtracted PSF in the ffist panel and our 
reconstruction. In the bottom panels, we show four examples of our PCAs. 



This method of refining the linear coefficients can be further 
optimised by minimising the weighted residuals, Rw, given 

by 

Rw = J2^I-ImfxMxW, (6) 

where W is a weighting functions. In the work presented 
here, we used a Gaussian weighting function centered on 
the planet positions. This forces the coefficients to those 
that give a model that is in good agreement with the data 
in the region close to the mask (and planet) position. These 
features tend to be more strongly correlated with the fea- 
tures inside the mask, thus giving slightly better fits. The 
specific optimal weighting scheme is likely to depend on the 
details arising from, for example, the particular instrument 
and telescope being used. 

3.3 PSF Correction and Averaging Over Stack 

For each image in the stack, we want to remove the flux 
from the star and to leave the image of the planet. The star 
light can be well modeled as a convolution between a delta 
function and the PSF of the observations for that frame. 
Since the intrinsic image is simple, removing the star light in 
the convolved image corresponds to a simple subtraction of a 
PSF model from the image. This should then leave an image 
of the planet convolved with the PSF. Since the model that 
we fit does not perfectly reproduce a given PSF, the planet 
image will still be subdominant to the noise in the residuals. 
We still need to average over the stack of images to boost 
the signal-to-noise of the planet. To do this, we derotate the 



images in the stack to the same on-sky orientation based 
on the changes in the parallactic angle of each frame. We 
then average over the stack. As well as the mean flux at 
each point, we also compute the variance, which gives us a 
measure of the noise in the temporal direction at each point 
in the averaged image. 

Our final average image should then give us a good rep- 
resentation of the planet convolved with a PSF that is ef- 
fectively an average PSF over all the data. On top of this, 
there remains considerable small-scale spatial noise that can 
be seen as pixel-to-pixel variations. To deal with this and to 
boost the detectability of the planet, we suggest two ap- 
proaches to smoothing out these small-scale features. The 
first is to deconvolve the image with a crude estimate of the 
average PSF. This will have the effect of pulling all the flux 
of the planet into one pixel. The other approach, which we 
have adopted in the current work due to its simplicity, is to 
use a matched filtering scheme. In this approach, the image 
is convolved with an estimate of the PSF, and the value of 
the convolved image at the planet position is then a measure 
of the total flux of the planet. 



4 PLANET DETECTION 

We have investigated the efficiency of planet detection for 
the data presented in section [2] We do this by creating a 
suit of simulations with planets of varying brightness at the 
three positions shown in Figure[l] We refer to these positions 
as PI, P2 and P3, and they have star planet separations 
of 0.52'', 0.29'' and 0.52", respectively. It is worth noting 
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that PI and P2 lie in the high-contrast hemisphere of the 
AFP PSF where the diffraction rings are suppressed, while 
P3 lies in the bad hemisphere where the diffraction rings are 
enhanced (see Kenworthy et al. 2010). We varied the planets' 
brightness to achieve a star-planet contrast between 8.0 and 
11.0 mag with incremental steps of 0.5 mag and analysed this 
suit with both LOCI and the PynPoint procedure outlined 
in section [3l 



4.1 Data Reduction Using LOCI 

Before running the LOCI package, we median-combined ev- 
ery consecutive 10 exposures in each cube, which resulted 
in 19 averaged frames per cube or 1026 frames in total for 
the 54 cubes we observed. Given the short integration time, 
the field rotation between the individual exposures was neg- 
ligible, and thus the median-combined frames did not suffer 
significant smearing effects of the simulated planets. We then 
applied the LOCI package on the 1026 frames as described 
in iLafr eniere et 'al.| (2007b), using the following LOCI co- 
efficients (same naming convention as in the original pa- 
per): FWHM=4.5 px, N5=0.7b, dr=5 and Na^SOO. The 
PSF-subtracted frames were then de-rotated and median- 
combined to create the final LOCI-reduced images. 



4.2 PynPoint vs. LOCI 

For the PynPoint data reduction, a particular detail to 
note is that in fitting the PSF model we have not used the 
masking procedure discussed in section [3^ This is because 
performing a blind search over the image with a mask is com- 
putationally very intensive, and our current findings show 
that even without this step we are already able to reach 
results that outperform what LOCI is able to achieve. How- 
ever, we return to the masking procedure in section [5] when 
we focus the discussion on measuring the fiux levels of planet 
candidates. We also note that we were able to perform the 
PynPoint analysis using the full stack of 10,800 individual 
exposures. 

In Figure |3] we show LOCI and PynPoint results for 
the two cases where the contrasts between the star and the 
planet are 8.0 mag and 10.5 mag. These results all use plan- 
ets located at position P2, which is the inner most configu- 
ration separated 0.29^^ from the star. 

The LOCI data reduction is explained in the previous 
section. To generate the PynPoint results, we have created 
an image that we have specifically tailored to object detec- 
tion. We do this by creating a detection image. Id, which 
we produce by dividing the average of the residual images 
by the root of the variance image. 



Id 



(7) 



We do this because Ivar is a measure of the noise in 
each pixel and performing this operation produces an image 
with a more even noise distribution. This does suppress the 
signal in high noise regions, as it should, but in particular 
for a signal-to-noise analysis it does allow us to work more 
directly with the entire image rather than being totally con- 
founded to working in rings around the star as is typically 



done. This is important since the PynPoint detection im- 
ages have been convolved in the final step, as outlined in 
section |3.3| to allow us to effectively match filter. In the 
example shown in Figure [3] we used a Gaussian of width 
of a = 0.027^^ (1 pixel in original image resolution) as a 
convolution kernel. For the PynPoint images we have also 
applied an automatic detection routine, the results of which 
are shown by the contours. We see that for the bright planet 
(contrast of 8.0 magnitudes), the automatic algorithm eas- 
ily finds the input planet. For the fainter planet, we see that 
the planet signal becomes comparable to some of the highest 
noise peaks, and in the image we see that the planet is one 
of the three brightest peaks. By eye, we see in all cases that 
there seems to be a substantial improvement in detectabil- 
ity for the PynPoint images as compared with the LOCI 
results. 

In Figure [4] we quantify the improved performance of 
our PynPoint software package by showing the detection 
signal-to-noise as a function of planet magnitude for the 
three positions we are considering. In all cases we measure 
the signal from the peak value of the Gaussian convolved im- 
age at the planet position (for example, the peak value in the 
lower left panel of Figure [s]). For the noise, we measure the 
standard deviation along the circumference of a circle that is 
centred on the star and has a radius that passes the center of 
the planet. The solid curves in Figure^ show the PynPoint 
results, the dashed curves show the LOCI results and the 
different colours correspond to position PI (red), P2 (blue) 
and P3 (green). The triangular symbols in the figure show 
the points where the planet peak is no longer the brightest 
peak over the entire image. We were only able to produce 
these results, shown with the symbols, for the PynPoint 
images because the LOCI images were dominated by the 
noise in the central regions and there were no LOCI cases 
where the planet is the most dominant peak. Figure [4] also 
shows la, 3a and 5a in horizontal dotted lines. 

For the two outer planet positions (PI and P3), we find 
that using PynPoint leads to a 50% boost in the signal- 
to- noise, which is a substantial improvement. Seen another 
way, for a fixed signal-to-noise threshold we see that when 
the data is analysed using PynPoint we are able to detect 
planets that are between 0.5 to 1.0 magnitude fainter than 
what is possible with LOCI. For the close-in planet position 
P2 at a separation of ^ 0.29^^ the improvement is even more 
striking. At this position, we see that the PynPoint signal- 
to-noise is boosted by roughly a factor of five. This is a 
very significant improvement. In fact, in the LOCI analysis 
the planet at this position cannot be detected at any of the 
brightnesses we consider. However, using PynPoint we are 
able to detect the planet at 5cr down to a magnitude of 8.6 
and at 3a down to magnitude 9.5. The importance of this 
improvement is even more significant when we consider the 
fact that the regions closest to the star, where PynPoint 
performs best as compared with LOCI, are scientifically the 
most interesting ones. These regions, which are as close to 
the star as possible, are the most important ones in terms 
of planet searches since direct imaging surveys have already 
demonstrated that giant planets are rare at large orbital 
separations (beyond a few tens of AU, see, e.g., Lafreniere 
et al.||2007a| Chauvin et al.||2010 t), but the parameter space 
closer to the star is still largely unconstrained. 
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Figure 3. Images used to detect the planets in position P2. The 
left panels show the cases for a planet with magnitude contrast 
of 8.0, and the right panels show results for the planet with a 
contrast of 10.5. The top two panels show results for LOCI, while 
the lower panels are the results Id from PynPoint. 



4.3 Potential Future Improvements 

Once a possible planet detection is found, it then makes 
sense to reanalyse the data by placing a mask over the tar- 
get position and using the method that we outlined in sec- 
tion [3]2] Though computationally more intensive, this step 
is feasible, since it only needs to be performed at specific 
locations. With this in mind, it is interesting to consider 
what an optimal detection strategy might be. For instance, 
we could imagine that in a two-step process it would make 
sense to have a more relaxed, but clearly defined, criterion 
for the first detection (i.e. the step described in this section), 
such as the detection of all peaks higher than 3a rather than 
5(7, which is more typical. All these candidates can then be 
studied in more detail using the masking procedure, as we 
do in section |5] at which point more aggressive cuts could 
be applied. We have not yet fully investigated this optimi- 
sation, but it certainly warrants further systematic work in 
the future. 



5 MEASURING THE PLANET FLUX 

It is well known that LOCI introduces flux losses to com- 
panions ( Lafreniere et al^|2007b ). These flux losses depend 
on the choice of LOCI parameters and also on the angular 
separation and the brightness of the companion itself, thus 
making an accurate analytical correction difficult. Hence, in 
our next tests we studied the accuracy of LOCI and Pyn- 
Point at recovering the flux of the planet once its position 
is given. 

In the PynPoint case, this means that we want to use 
the residual images Ires and not the detection images /d, 
and we also use the planet masking procedure when model- 
ing the PSF to limit possible biases. The results are shown 
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Figure 4. Detection signal-to-noise (defined in section |4.2| as a 
function of the contrast difference between the central star and 
the planet. The solid curves show results for PynPoint, and the 
dashed curves show results for LOCI. The colours correspond to 
the three planet positions that we have used in this study. The 
horizontal dotted lines show the signal-to-noise thresholds of 5, 
3, and 1. 



in Figure [5] The colour scheme and line styles are the same 
as those in Figure ^ The x-axis of this figure shows the 
flux inside an aperture with a radius of 0.064^^ (^2.4 pixels 
in original image resolution) centered on the input planet 
image (e.g., see lower panel of Figure [T]). The y-axis shows 
the flux that is measured using the same procedure on the 
LOCI and PynPoint images. The diagonal dotted black 
line shows what we would expect in an idealised case where 
the measured flux is the same as the input flux. In general, 
we see that in all cases there is a bias between the input flux 
and measured flux. The important thing to note here is that 
the three PynPoint cases give reasonably stable off-sets, 
while the LOCI results show a large range in their perfor- 
mance. In the worse cases of PI and P2, LOCI underesti- 
mates the planet close to a factor of 5. While the two worse 
cases for PynPoint underestimate the flux by a factor of 
2. This combined with the better stability of the PynPoint 
results means that PynPoint is more likely that it can be 
calibrated. This leads to another example of better perfor- 
mance of our method. We believe that further improvements 
are possible through continued work and optimisation. 



APPLIED EXAMPLE 
13 PICTORIS B 



THE EXOPLANET 



To further test and validate PynPoint, we downloaded a 
high-contrast imaging dataset for the exoplanet /3 Pictoris 
b. The data are publicly available from the ESO archive and 
were initially used to confirm the existence of this young, 
massive planet orbiting the nearby A-type star /3 Pic |La-| 
grange et aI.|201Q| ). The dataset we used was taken on 2009 
December 26 with VLT/NACO in the L' filter in ADI mode. 
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Figure 5. Measured flux as a function of intrinsic planet flux. 
The black line shows the case where the measured flux is the 
same as that of the planet; the solid coloured curves show the 
results from PynPoint, while the dashed curves show the LOCI 
results. The colour scheme is the same as those used in Figure [4] 
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Figure 6. The exoplanet (3 Pictoris b observed in December 2009 
in the fllter as revealed with PynPoint (see text). The image 
has been normalised by the standard deviation (a) in a ring at the 
planet radius. The ±15° around the planet has not been included 
in the calculation of a. 



It consists of 80 data cubes, each containing 300 individ- 
ual exposures with a detector integration time of 0.2 s. The 
total field rotation amounted to ^44° on sky. Again, the 
basic data reduction steps (sky subtraction, bad pixel cor- 
rections and alignment of images) were done as explained in 
Quanz et al. (2011). The final postage stamps we used for 



our PynPoint analysis were 73x73 pixels in the original 
images size, but as explained in section |3.1.2| we doubled 
the resolution of the images (in this case to 146x146 pix- 
els). Our PynPoint analysis effectively has one significant 
free parameter that is set by the user. This is the number of 
PC A coefficients that are used to fit each image. However, 
we find that our results are fairly stable over a wide range 
of parameter choices. For /3 Pictoris b we are consistently 
able to detect the planet with a signal-to-noise in the range 
of 15 to 20 when changing this parameter by a factor of 3 
(between 20 and 60). We note that since the PSF of this 
non-coronagraphic dataset is less complex than that of the 
APP, fewer coefficients are needed than in the test datasets 
in the previous sections. For the results shown in Figure |6] 
we used 40 PC A coefficients. We see that the exoplanet /S 
Pictoris b is by far the strongest signal in the image and is 
clearly detected with a signal-to-noise of ^20 south-west of 
the star at a position angle of ^209°. 



7 SUMMARY AND OUTLOOK 

We have developed a new Python-based software package - 
called PynPoint - for analysing data for the direct imaging 
of exoplanets. This new method offers a number of improve- 
ments over what is currently available. In particular, we use 
a principle component analysis to model the PSF of the 
observations that we then use to subtract the flux coming 
from the central star, thus revealing the signal of a faint 



(planetary) companion. We have applied our technique to a 
set of simulations based on ADI data taken with NACO on 
the VLT. This dataset used an apodising phase plate, which 
suppresses the diffraction rings from the star on one side of 
the image. By inserting simulated planets into these data, 
we find that PynPoint achieves a detection signal-to-noise 
that is 50% larger than our LOCI reduction at separations 
of ^0.59^^ and up to a factor of five better for the smaller 
inner working angle of ^0.29^^ We also find that when mea- 
suring the flux of the planet, PynPoint gives more stable, 
and hence more accurate, results than what we find with 
LOCI. 

To demonstrate that PynPoint works also on non-APP 
data and that it reveals real exoplanets and not only sim- 
ulated ones, we have re-analysed publicly available L' data 
from the exoplanet /3 Pictoris b observed in December 2009. 
We clearly detect the exoplanet in our final image with a 
signal-to- noise close to ^20. 

Our method relies on empirically constructing a basis 
set from the data. For this reason, it should be fairly generic 
and work over a range of wavelengths. The simplicity of our 
method also means that the number of user-specified param- 
eters is low. Essentially, the only free parameter that needs 
to be chosen with care is the number of basis coefficients 
that are used to model the PSF. All other free parameters, 
such as the size of the masked region, can be automated, 
and their fine-tuning should not have a significant impact 
on the final results. 

We have written PynPoint in an object-oriented and 
modular way and we plan to continuously make improve- 
ments to specific sections as we further develop and test our 
method. In particular, we are focused on: (i) making sure 
that our basis set are efficient (i.e. sparse); (ii) improving 
our temporal averaging over the stack; and (iii) improving 
our spatial algorithms for denoting. We are also in the pro- 
cess of testing our method on different datasets. It will be 
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interesting to see how well PynPoint performs at shorter 
NIR wavelengths, where the Strehl ratio of the observations 
is lower than for the data presented here. Furthermore, we 
hope to obtain test datasets from different instruments to 
investigate how PynPoint can be optimised for individual 
observing programs. In so doing, we hope to soon be able 
to deliver a well-tested package that we can release pub- 
licly to the community, thus supporting the large, dedicated 
exoplanet surveys that will come online in the near future. 
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